SYSTEM INTERRUPTS


I. Overview 


This article is intended to provide a better understanding of system interrupts. It discusses the
various types of system interrupts and outlines steps that will enable you, the customer, to
provide Hewlett-Packard with information that is useful in determining the cause of an interrupt.


A feature in the North American Response Centers is the System Interrupt Team. This team is 
composed of highly trained software and hardware engineers who are experienced in troubleshooting 
system interrupts. When you call the Response Center to report a system interrupt, the problem will 
be assigned to one of these engineers. The engineer will then call you as soon as possible (often within 
15 minutes) and begin the investigation of the system interrupt. The European Response Centers are
putting similar programs in place.


II. Types of System Interrupts


A system interrupt can be defined as a condition which prevents all processes on an HP 3000 system 
from executing. There are various types of system interrupts. Those which are discussed here are 
SYSTEM FAILURES, SYSTEM HALTS, HALTS, HANGS, SYSSTOP conditions, and POWER 
FAILURES.


SYSTEM FAILURE 


A system failure occurs when HP software detects an abnormal condition which is considered to be a 
threat to data integrity. The procedure which handles system failures in MPE is SUDDENDEATH. 
SUDDENDEATH prints the following information on the console and halts the system (this
information can also be found on p. 9-2 of the System Operation and Resource Management manual
(SORM), p/n 32033-90005):


****SYSTEM FAILURE #enum
STATUS snum 
DELTAP pnum 


where:
ENUM is the error number that identifies the type of system failure
SNUM is the code segment number from which the system failure was called
PNUM is the program counter (Delta-P) offset into the code segment


A system failure list can be found in Table 9-1 of the SORM. 
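
When reporting a system failure, it helps to read the three values off the console accurately. The following is a minimal Python sketch of how a transcribed console message could be split into its ENUM, SNUM, and PNUM fields. The function and class names are hypothetical, and the exact spacing and number base of the real console output are assumptions, so the values are kept as text rather than converted to integers.

```python
import re
from typing import NamedTuple, Optional


class SystemFailure(NamedTuple):
    enum: str  # error number that identifies the type of system failure
    snum: str  # code segment number from which the failure was called
    pnum: str  # Delta-P offset into the code segment


# Assumed layout: the three labelled fields appear in this order, separated
# by spaces or line breaks. The leading asterisks are ignored.
_PATTERN = re.compile(
    r"SYSTEM\s+FAILURE\s+#?\s*(?P<enum>\S+)\s+"
    r"STATUS\s+(?P<snum>\S+)\s+"
    r"DELTAP\s+(?P<pnum>\S+)",
    re.IGNORECASE,
)


def parse_system_failure(console_text: str) -> Optional[SystemFailure]:
    """Return the three fields, or None if the text is not a system failure."""
    match = _PATTERN.search(console_text)
    if match is None:
        return None
    return SystemFailure(match["enum"], match["snum"], match["pnum"])
```

For example, a message transcribed as "SYSTEM FAILURE #16 / STATUS 23 / DELTAP 1042" (made-up values) would yield enum "16", snum "23", and pnum "1042", which can then be read to the HP engineer.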


SYSTEM HALT 


A system halt occurs when HP microcode (not HP software) detects an abnormal condition which is
considered to be a threat to data integrity. 


On Series 64/68/70 and Series 37 systems, you'll see the following message displayed on the console:
SYSTEM HALT condition 


where CONDITION is the text which corresponds to the system halt numbers listed in Table 11-8 of 
the SORM. 


On other systems, you'll see the following message displayed on the console: 
SYSTEM HALT nn 


where NN is the system halt number. Table 11-8 in the SORM gives the meaning of each system
halt number.


HALT 


A HALT is not the same as a SYSTEM HALT. HALT is an MPE machine instruction which is
executed by software when it detects an abnormal condition which is considered to be a threat to
data integrity.


On Series 64/68/70 systems, the HALT light will be lit on the CPU, the maintenance prompt ("M>" or 
"C>") will be displayed on the console, and the halt number will be displayed in the banner on the 
console. 


On other systems, the HALT light will be lit on the CPU and the following message will be displayed 
on the console: 


HALT nn 


where NN is the halt number (0-15). Since the HALT instruction can be executed by any software
running in Privileged Mode (including non-HP software) and any number between 1 and 15 can be
supplied as the halt number, no list of halt numbers and their meanings can be developed by HP.


NOTE 


For systems other than Series 64/68/70, it is very important to use the 
appropriate terminology when speaking with HP engineers about 
SYSTEM HALTs and HALTs. If the system is interrupted with a 
SYSTEM HALT 3, for instance, tell the HP engineer "The system failed
with a SYSTEM HALT 3" rather than "The system failed with a HALT
3."
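
The distinction in the note above can be illustrated with a short Python sketch that classifies a transcribed console line. The function name is hypothetical and the message layouts are assumed to match the examples shown in this article. The key point is that "SYSTEM HALT" is tested first, because a plain search for the word "HALT" would match both messages.

```python
import re
from typing import Optional, Tuple


def classify_halt(console_line: str) -> Optional[Tuple[str, str]]:
    """Return ("SYSTEM HALT", detail) or ("HALT", number), else None."""
    text = console_line.strip()

    # SYSTEM HALT is followed by a number on most systems, or by the
    # condition text on Series 64/68/70 and Series 37 systems.
    match = re.fullmatch(r"SYSTEM\s+HALT\s+(.+)", text, re.IGNORECASE)
    if match:
        return ("SYSTEM HALT", match.group(1).strip())

    # A HALT instruction is followed by a halt number only.
    match = re.fullmatch(r"HALT\s+(\d+)", text, re.IGNORECASE)
    if match:
        return ("HALT", match.group(1))

    return None
```

With this sketch, "SYSTEM HALT 3" is reported as a SYSTEM HALT with detail "3", while "HALT 3" is reported as a HALT with number "3", so the two are never confused when the interrupt is described to the engineer.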


HANG 


A hang is characterized by an inability to obtain colon prompts on any terminals. The RUN light is lit 
on the CPU and there is no system failure, system halt, halt, or sysstop error message on the console. 


When the system is hung, it is a good idea to check all disc drives to make sure they are ready and not
reporting drive faults. If a disc drive is reporting a drive fault, mention this to the HP engineer when
the engineer contacts you, but do nothing to correct the problem until you have spoken with the
engineer. If a disc is not ready, check the HPIB and power cables to see if either has been disconnected.
If this is the case, reattach the cable and call the Response Center if the system is still hung after
doing so.


SYSSTOP 


Referring to P. 11-36 of the SORM, SYSSTOPs "indicate a specific hardware problem as detected by 
the DCU during normal startup and system operation. These errors are referred to as DCU hardware 
halts, but when these halts occur, the DCU enters the Maintenance Mode. Some of the errors can be 
caused by software, usually an address to non-existent memory. This forces an "INVALID ADDRESS" 
error message. Other errors can be forced by bad hardware, such as a double-bit memory error (an 
uncorrectable memory error)." SYSSTOPs will only occur on Series 64/68/70 systems. Refer to Table 
11-7 in the SORM for a list of SYSSTOP error messages. 


POWER FAILURE 


A power failure occurs when the power supplied to the CPU drops below a preset value. At the time
of a power failure, the battery backup preserves the contents of memory for a period that depends on
memory size, i/o configuration, and the condition of the battery, so that normal operation can resume
when sufficient power returns. A successful power failure recovery is noted by the following message
on all terminals logged on at the time of the power failure:


**** POWER FAILURE ****


In this case, it is not necessary to report the power failure to the Response Center since the system 
recovered successfully. 


If power returns in a state of flux, it is possible for the system to hang during power failure recovery. 
Call the Response Center when this happens. 


If power is off for an extended period of time, the battery backup will expire. When power returns, 
the HALT light is lit on the CPU and the maintenance prompt is displayed on the console (the
maintenance prompt is "->" for Series 4X/5X systems and "H FOR HELP" for Series 37 systems). You
should perform a WARMSTART and you should treat the power failure as a system interrupt with
respect to the standard recovery that is performed after the system is restarted (IMAGE, KSAM, etc.).
You do not need to call the Response Center to report the power failure if the battery backup expires 
since this is not an abnormal condition. 
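
The three power failure outcomes described above can be summarized in a small Python sketch. The names used here are hypothetical and the wording of each action simply restates the text of this section; it is an illustration of the decision, not an HP-supplied procedure.

```python
from enum import Enum
from typing import NamedTuple


class PowerOutcome(Enum):
    RECOVERED = "recovered"                        # POWER FAILURE message seen on terminals
    HUNG_DURING_RECOVERY = "hung during recovery"  # power returned in a state of flux
    BATTERY_EXPIRED = "battery expired"            # HALT light lit, maintenance prompt shown


class Advice(NamedTuple):
    action: str
    call_response_center: bool


def power_failure_advice(outcome: PowerOutcome) -> Advice:
    """Map a power failure outcome to the action described in this section."""
    if outcome is PowerOutcome.RECOVERED:
        return Advice("none; the system recovered successfully", False)
    if outcome is PowerOutcome.HUNG_DURING_RECOVERY:
        return Advice("call the Response Center", True)
    # Battery backup expired: restart and run the standard recovery.
    return Advice("WARMSTART, then standard recovery (IMAGE, KSAM, etc.)", False)
```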


III. Capturing Useful Information


For system interrupts other than power failures, it is important to capture information which can be 
useful in identifying the cause of the interrupt. 


On Series 64/68/70 systems, it is important to capture the contents of CPU registers when a system 
halt or SYSSTOP condition occurs. You can do this by performing a string dump. If you perform a 
string dump, it MUST be performed before a memory dump (discussed later) is performed. If not, the
string dump will contain no useful information. Due to the complexity of the string dump procedure 
and the differences in taking a string dump on different CPU types, please refer to appendix A of the 
SORM for string dump instructions. If you have any questions on this procedure, please contact your 
account Customer Engineer (CE).


For system failures, system halts, halts, hangs, and "INVALID ADDRESS" SYSSTOP conditions, it is 
important to capture the contents of main memory by performing a memory dump. A memory dump 
is performed as follows: 


o If the system is hung, get to the maintenance prompt by pressing the "CTRL" and "B" keys
  simultaneously and enter "HALT".

o Mount a scratch tape with a write ring on the tape drive which has been configured with device
  class DDUMP (it may be useful to always have a printed copy of your i/o configuration by the
  console).

o Enter "DUMP".

o You will then see the following messages displayed on the console:

  * * * SOFTWARE DUMP FACILITY (VER XX.XX/XX) * * *

  MOUNT DUMP MEDIA, AND PLACE DRIVE ON-LINE.
  PRESS THE RETURN KEY TO CONTINUE EXECUTION OF SOFTDUMP.

o At this point, press return and the memory dump will be performed.


IV. Recovery Steps


1) Perform a string dump if the interrupt was a system halt or SYSSTOP condition and your system is
a Series 64/68/70. If the system halted with a LUT PARITY ERROR or WCS PARITY ERROR, 
however, it is useful to call the Response Center while the system is down and wait for the engineer’s 
callback before doing anything. This is because the engineer can determine the cause of the problem 
through the use of DCU commands, thereby eliminating the need for a string dump. 


2) Perform a memory dump (if the system is hung, first get the maintenance prompt by pressing the
"CTRL" and "B" keys simultaneously on the console and enter "HALT").


3) Perform a WARMSTART. 
4) Print or copy spoolfiles, if necessary, then delete them to avoid losing disc space. 


5) Perform a COOLSTART, or check for free space (LARGEST FREE AREA of 17000 or more) on 
logical device 1 and perform a COLDSTART or UPDATE from your current coldload tape. 


6) Call the Response Center to report the interrupt. 


7) Gather information (modem phone number, MGR.TELESUP passwords, and any other security 
passwords) in the event it is necessary for the Response Center engineer to dial into the system. 


8) Text the dump onto disc by doing the following:


a) Find the latest version of IDAT by running both IDATS.PUB.SYS and IDAT.PRV.TELESUP and checking
the date displayed in each banner.


b) RUN the latest version. 
c) Text the dump onto disc by entering: 


T filename, TAPE 


"filename" should represent the type of interrupt (for example, SF16) and can be up to 7 characters 
in length. 
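
As an illustration of the naming rule above, the following Python sketch checks a proposed dump file name and builds the text command shown in step 8c. The function name is hypothetical. The 7-character limit comes from this article; the requirement that the name be alphanumeric and begin with a letter is an assumption based on general MPE file naming conventions.

```python
def idat_text_command(filename: str) -> str:
    """Build the 'T filename, TAPE' command after checking the file name."""
    name = filename.strip().upper()

    # Length limit stated in this article.
    if not 1 <= len(name) <= 7:
        raise ValueError("file name must be 1 to 7 characters long")

    # Assumption: letters and digits only, beginning with a letter.
    if not (name.isascii() and name.isalnum() and name[0].isalpha()):
        raise ValueError("file name must be alphanumeric and start with a letter")

    return f"T {name}, TAPE"
```

For a system failure 16, idat_text_command("SF16") returns the string "T SF16, TAPE", while a name such as "SYSFAIL16" is rejected because it is longer than 7 characters.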


Here’s a flowchart that may simplify the recovery steps mentioned above: 


 -------------        -------------        ---------        ------------------
| LUT or WCS  |  NO  | SYSTEM      |  NO  |         |  NO  |                  |  NO
| PARITY      |----->| FAILURE     |----->|  HANG?  |----->| Series 64/68/70? |-------
| ERROR?      |      | or HALT?    |      |         |      |                  |      |
 -------------        -------------        ---------        ------------------       |
       |                    |                  |                    |                |
       | YES                | YES              | YES                | YES            |
       |                    |                 \|/                  \|/               |
       |                    |            -------------        -------------          |
       |                    |           | HALT system |      | STRING DUMP |         |
       |                    |            -------------        -------------          |
       |                    |                  |                    |                |
      \|/                  \|/                 |                    |                |
 -------------        -------------            |                    |                |
| STEP 6 ONLY |      | MEMORY DUMP |<-------------------------------------------------
 -------------        -------------
                            |
                           \|/
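
The decision logic of the flowchart can also be written as a short Python sketch. The function name and the action strings are hypothetical, and the sketch covers only the part of the flowchart up to the memory dump. It assumes that an interrupt which is not a system failure, a HALT, or a hang is a system halt or SYSSTOP condition, as the flowchart implies.

```python
from typing import List


def actions_before_restart(
    interrupt: str,
    series_64_68_70: bool,
    lut_or_wcs_parity_error: bool = False,
) -> List[str]:
    """Return the actions to take, in order, before restarting the system."""
    kind = interrupt.strip().upper()

    # Parity errors: call the Response Center and wait for the callback.
    if lut_or_wcs_parity_error:
        return ["STEP 6 ONLY"]

    if kind in ("SYSTEM FAILURE", "HALT"):
        return ["MEMORY DUMP"]

    # A hung system must be halted from the console before it can be dumped.
    if kind == "HANG":
        return ["HALT SYSTEM", "MEMORY DUMP"]

    if kind in ("SYSTEM HALT", "SYSSTOP"):
        # The string dump must come first or it will contain no useful data.
        if series_64_68_70:
            return ["STRING DUMP", "MEMORY DUMP"]
        return ["MEMORY DUMP"]

    raise ValueError("unrecognized interrupt type: " + interrupt)
```

For example, a SYSTEM HALT on a Series 68 gives a string dump followed by a memory dump, while the same interrupt on another series gives a memory dump only.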


V. Conclusion 


You should now have a better understanding of system interrupts. Also, you should now be able to 
provide HP with information that is useful in determining the cause of the interrupt. This will enable 
us to offer the appropriate resolution in a timely manner. Finally, for your benefit, please report 
every system interrupt to the Response Center at the time of its occurrence. 